Abstract

It is shown that a subelectronvolt upper limit on the neutrino mass can be derived from the
CMB data alone in the $\Lambda$CDM model with power-law adiabatic perturbations, without the aid
of any other cosmological data. Assuming the flatness of the universe, the constraint we can derive
from the current WMAP observations is $\sum m_\nu < 2.0$ eV at the 95% confidence level for the sum
over three species of neutrinos ($m_\nu < 0.66$ eV for the degenerate neutrinos), obtained by
maximising the likelihood over 6 other cosmological parameters. This constraint changes little even
if we abandon the flatness assumption for the spatial curvature. We argue that it would be difficult
to improve the limit much beyond $\sum m_\nu < 1.5$ eV using only the CMB data, even if their
statistics are substantially improved. However, a significant improvement of the limit is possible
if an external input is introduced that constrains the Hubble constant from below. The parameter
correlation and the mechanism of CMB perturbations that give rise to the limit on the neutrino mass
are also elucidated.
I. INTRODUCTION



The upper limit on the absolute mass of neutrinos is derived from the end-point spectrum
of tritium beta decay experiments. It is not easy, however, to push the limit to the
subelectronvolt range. An alternative hope is to resort to cosmological considerations. The
presence of massive neutrinos affects cosmic perturbations, most characteristically by
reducing the power on small scales through free streaming in the early universe. In a low
matter density universe the effect is significant even if the neutrino mass is of the order of
subelectronvolts [1], and upper limits of a few eV on the summed mass of the three
species of neutrinos are obtained from the power of galaxy clustering combined with the
normalisation of the fluctuation power at large scales from the magnitude of quadrupole
anisotropies in the cosmic microwave background (CMB) temperature field [2, 3], or from
the shape of the power spectrum of galaxy clustering [4].

Massive neutrinos also affect perturbations in the CMB temperature field at intermediate
to small scales in a less trivial manner (see [5, 6] for the earlier work). The effect here is
via the modification of CMB perturbations, especially through the integrated Sachs-Wolfe
effect, rather than simply the reduction of the power at small scales. Combining the CMB
multipoles of WMAP with the galaxy clustering data of 2dFGRS, Spergel et al. [7] derived
$\sum m_\nu < 0.7$ eV; using the SDSS power spectrum, Tegmark et al. [8] give
$\sum m_\nu < 1.7$ eV for the sum mass; see also Refs. [...]. A general problem with the
cosmological analyses is how the result depends on explicit or implicit assumptions and
systematics, especially when two or more pieces of different types of data, such as CMB
multipoles and galaxy clustering data, are combined. In this context it is an important question
whether one can derive a comparable limit on the neutrino mass from the CMB data
alone. Tegmark et al.'s analysis shows that such a limit is not derived from the CMB
data (WMAP data) alone, allowing for the possibility that massive neutrinos represent the
entire dark matter at the one sigma confidence level, whereas the earlier work of Eisenstein
et al. [...] seems to forecast the contrary. We consider that this is an important point that
deserves further study, especially in view of the fact that the quality of the CMB temperature
field data will be improved in the future, notably by PLANCK in about half a decade. It is a
consequential question whether one can improve the limit on the neutrino mass without resorting
to the large-scale galaxy clustering data, for which unknown biasing and poorly controlled
nonlinear effects are always a concern.

It is also important to understand whether the limit depends upon the assumption of
exactly flat spatial curvature of the universe, as customarily assumed when the constraints
on neutrinos have been discussed. We already know that the curvature is quite close to flat,
but the possibility of a slight departure from flatness is not excluded. For instance,
the derivation of a consistent Hubble constant from the CMB alone depends crucially on the
flatness assumption: a slight departure, say by 2% in the spatial curvature, shifts
the "CMB best value" of the Hubble constant to an unacceptably small value. There is some
reason to expect that a small neutrino mass may produce an effect similar to that of non-flat
curvature, so that the two effects might cancel, loosening the limit.

In this paper we investigate, within the $\Lambda$CDM universe with adiabatic perturbations,
whether a sensible limit on the neutrino mass can be derived from the CMB data
alone, and if this is the case, how the limit depends upon the assumption of the exact
flatness of the universe. A particular emphasis is given to elucidating the parameter correlation
and the mechanism in the CMB perturbation theory by which the neutrino mass limit
is derived. In our argument we extensively use the "reduced CMB observables": the position
of the first acoustic peak $\ell_1$, the height of the first peak normalised to the low-$\ell$ value $H_1$,
the height of the second relative to the first peak $H_2$, and the height of the third relative to
the first peak $H_3$, introduced in Hu et al. [15], and study how massive neutrinos affect
these variables.

We assume that the three neutrinos have a degenerate mass. This will be a realistic
assumption if the neutrinos have masses close to the upper limit that concerns us, because
the neutrino oscillation experiments tell us that the differences of masses are much smaller
than the upper limit. In our numerical work we ignore the tensor perturbations, but we
argue that their inclusion would only tighten the limit on the neutrino mass. We assume
that the cosmic density perturbations have a power-law spectrum specified by the index $n_s$.
A small departure from the power-law spectrum, such as that predicted by slow-roll inflation,
does not change our analysis. If the departure is large, such as that indicated by the WMAP team
combining their CMB data with the galaxy clustering, our result will need modification: in
such a case one cannot argue for the limit on the neutrino mass unless the primordial power
spectrum is given.

In the next section, we show with numerical work that we can derive a sensible limit
on the neutrino mass from the CMB data alone under the assumption of the exact spatial
flatness of the universe. In Sec. III we consider the effect of massive neutrinos on the reduced
CMB observables, and discuss how one can obtain the constraint from the CMB data alone.
In Sec. IV we discuss the physics of the response of the reduced CMB observables to massive
neutrinos in CMB perturbation theory. In Sec. V we consider the constraint in non-flat
universes, and show that a comparable constraint is derived. The conclusion is given in
Sec. VI.

FIG. 1: Minimum $\chi^2$ as a function of the neutrino energy density $\omega_\nu$. The solid curve is
for the flat universe. The dotted and the dashed curves show the cases for a negative and a positive
curvature universe, respectively.



II. LIMIT ON THE NEUTRINO MASS FROM WMAP ALONE 

FIG. 2: The cosmological parameters for the solutions that give minimum $\chi^2$ as a function of
$\omega_\nu$. The two line segments shown in panel (c) are the cases for a negative (dotted line) and a
positive (dashed line) curvature universe.

TABLE I: Solutions for $\chi^2_{\min}(\omega_\nu)$.

The parameters of the $\Lambda$CDM model we shall consider are the baryon density
$\omega_b = \Omega_b h^2$, the matter density $\omega_m = \Omega_m h^2$ (which includes baryons but
excludes neutrinos), the Hubble constant $h$, the reionisation optical depth $\tau$, the scalar
spectral index $n_s$ of the power-law adiabatic perturbations, and the overall normalisation $A$,
where $\Omega_i$ denotes the energy density in units of the critical density and $h$ is defined by
$H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$. We ignore the tensor perturbations. We define the
normalisation parameter by $A = \ell(\ell+1)C_\ell^{TT}/2\pi$ in units of $\mu$K$^2$ at $\ell = 2$,
which differs from the WMAP definition. In addition, we include the neutrino
mass density $\omega_\nu = \Omega_\nu h^2$, which is related to the neutrino mass as

$$\omega_\nu = \frac{\sum m_\nu}{94.1\ \mathrm{eV}}. \qquad (1)$$

We assume three generations of neutrinos with their masses being degenerate,
$m_{\nu_1} = m_{\nu_2} = m_{\nu_3}$, so $m_\nu = 31.4\,\omega_\nu$ eV. The vacuum energy is taken to
satisfy the flat curvature $\Omega_{\rm tot} = \Omega_\Lambda + \Omega_m + \Omega_\nu = 1$, but this
condition is relaxed in Sec. V. We often write $\omega_\Lambda = \Omega_\Lambda h^2$. We
run CMBFAST [16] to calculate CMB multipoles for a total of $1 \times 10^6$ sets of parameters
in the course of our work. The $\chi^2$ is computed for the entire temperature (TT) and
polarisation (TE) data set of WMAP (899 and 449 points, respectively) using the likelihood
code supplied by the WMAP team [17, 18, 19].
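As a quick numerical check of Eq. (1) and of the degenerate-mass relation, the conversion can be sketched in Python. The function names are illustrative and do not come from the paper; the only inputs taken from the text are the factor 94.1 eV and the assumption of three degenerate species.

```python
# Conversion between the neutrino density omega_nu = Omega_nu h^2 and the
# neutrino mass, following Eq. (1): omega_nu = sum(m_nu) / 94.1 eV.
# Function names are illustrative; they are not taken from the paper.

EV_PER_UNIT_OMEGA = 94.1  # eV, the factor in Eq. (1)
N_SPECIES = 3             # three degenerate species, as assumed in the text


def sum_mass_ev(omega_nu: float) -> float:
    """Sum of the neutrino masses in eV for a given omega_nu."""
    return EV_PER_UNIT_OMEGA * omega_nu


def mass_per_species_ev(omega_nu: float) -> float:
    """Mass of each species in eV, assuming degenerate masses."""
    return sum_mass_ev(omega_nu) / N_SPECIES


def omega_nu_from_sum_mass(sum_mass: float) -> float:
    """Inverse of Eq. (1): omega_nu for a given mass sum in eV."""
    return sum_mass / EV_PER_UNIT_OMEGA


if __name__ == "__main__":
    for w in (0.01, 0.021, 0.024):
        print(f"omega_nu={w}: sum={sum_mass_ev(w):.2f} eV, "
              f"per species={mass_per_species_ev(w):.2f} eV")
```

For $\omega_\nu = 0.021$ this gives about 0.66 eV per species, i.e. about 2.0 eV for the sum, in line with the numbers quoted in the abstract.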

We search for the minimum $\chi^2$ for fixed $\omega_\nu$, and refer to the resulting minimum for
a fixed $\omega_\nu$ as $\chi^2_{\min}(\omega_\nu)$. We prefer to use a deterministic search for the
minimum rather than the Markov chain Monte Carlo (MCMC) method that is popular in recent work
on CMB analysis, since we find that the latter, while in principle efficient at finding
the gross global structure of the likelihood function, often fails to yield the accurate shape of
the likelihood function away from the minimum unless the chain is long.

FIG. 3: The values of the reduced CMB observables for the solutions that give the minimum $\chi^2$,
as a function of $\omega_\nu$.

To search for the minimum $\chi^2$ in the 6-parameter ($\omega_b$, $\omega_m$, $h$, $\tau$, $n_s$, $A$)
space, we adopt a nested grid search. Technically, we apply the Brent method [20] of successive
parabolic interpolation to find a minimum with respect to one specific parameter with the other
parameters held at a given grid point, and successively apply this method to the remaining
parameters to find the global minimum^1. We describe more details of this minimisation procedure
in Appendix A. If more than one conspicuous minimum is detected in the process, we apply this
method to each local minimum. We run the CMBFAST code typically $10^{\ldots}$ times to find the
global minimum for a given $\omega_\nu$. Note that the adoption of the Brent method greatly reduces
the number of grid points needed for a required accuracy.

In order to obtain the likelihood function with respect to a specific parameter, we must in
principle integrate over the parameters other than the one that concerns us. The $\chi^2$ function
could differ significantly from the true likelihood function if the distribution is not
Gaussian. To verify this point, we carry out an adaptive Monte Carlo integral using the Vegas
code [21] to check whether the likelihood function inferred from the $\chi^2$ function differs
significantly from that obtained by integrating over parameter space. The integral is performed
for the cases of $\omega_\nu = 0$ and 0.08, the latter being the value for which Tegmark et al. give
a rather high likelihood. In particular, we want to check whether a local minimum that gives a
relatively large $\chi^2$ is favoured by a large measure of parameter space. In so far as we have
examined, there is no evidence that the likelihood inferred from the $\chi^2$ function differs
significantly from that obtained from the integral (examples are shown below). In particular, we
do not find a case where the integration measure overcomes an excess in $\chi^2$: the parameter
sets that give the global $\chi^2$ minimum always represent the maximally likely parameters in the
cases we studied.

^1 The initial range of the parameters we searched is wide, e.g., $0 < h < 1$ and
$\Omega_\Lambda > 0$, etc. Note that the priors do not play any important role in our grid search,
unlike in MCMC where the priors are crucial. Should we find a parameter region near the boundary
that results in a meaningfully small $\chi^2$ and contributes non-negligibly to the likelihood
function, we would simply enlarge the parameter region for the search. This never happened in our
case, however.

TABLE II: Comparison of the solution for the massless neutrino with those given by Spergel et al.
and Tegmark et al. The errors stand for the one $\sigma$ confidence level.

TABLE III: Parameters for the two minima for $\omega_\nu = 0$.
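The nested one-dimensional minimisation described above can be illustrated with a small Python sketch. It is not the procedure of Appendix A: a plain golden-section search stands in for the Brent method, and a two-parameter toy quadratic stands in for the $\chi^2$ computed from CMBFAST and the WMAP likelihood code. All names (`golden_min`, `nested_min`, `toy_chi2`) are hypothetical.

```python
import math

GOLDEN = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618


def golden_min(f, lo, hi, tol=1e-7):
    """Minimise a unimodal 1-D function f on [lo, hi]; return (x, f(x)).

    Golden-section search is used here as a simple stand-in for Brent's
    method, which adds parabolic interpolation for faster convergence.
    """
    a, b = lo, hi
    c = b - GOLDEN * (b - a)
    d = a + GOLDEN * (b - a)
    fc, fd = f(c), f(d)
    while abs(b - a) > tol:
        if fc < fd:          # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - GOLDEN * (b - a)
            fc = f(c)
        else:                # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + GOLDEN * (b - a)
            fd = f(d)
    x = 0.5 * (a + b)
    return x, f(x)


def nested_min(f, bounds, fixed=()):
    """Minimise f over all parameters in `bounds`, one nested level each.

    For every trial value of the first parameter the remaining parameters
    are minimised recursively. Returns (best_parameters, minimum_value).
    """
    lo, hi = bounds[0]
    rest = bounds[1:]
    if not rest:
        x, fx = golden_min(lambda v: f(*fixed, v), lo, hi)
        return (x,), fx

    def profile(v):
        # chi^2 minimised over the inner parameters at fixed v
        return nested_min(f, rest, fixed + (v,))[1]

    x, _ = golden_min(profile, lo, hi, tol=1e-6)
    inner, fx = nested_min(f, rest, fixed + (x,))
    return (x,) + inner, fx


def toy_chi2(h, ns):
    """Toy chi^2 surface with correlated parameters (not a CMB likelihood)."""
    dh, dn = (h - 0.7) / 0.05, (ns - 0.97) / 0.03
    return 1000.0 + dh * dh + dn * dn + 0.8 * dh * dn


if __name__ == "__main__":
    best, chi2 = nested_min(toy_chi2, [(0.4, 1.0), (0.8, 1.3)])
    print(best, chi2)
```

A one-dimensional bracketing search of this kind finds only the minimum inside the bracket it is given, which is why each conspicuous local minimum has to be searched separately, as noted in the text.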
The solutions that give a minimum $\chi^2$ for a given $\omega_\nu$ are presented in Table I. The
$\chi^2_{\min}$ as a function of the neutrino mass density is shown in Figure 1, and the 6 parameters
of the solution for each neutrino mass density, also given in the table, are displayed in Figure
2. The corresponding four reduced CMB observables (defined below) are shown in Figure 3
for use in the next section.

We first note that the 6 parameters for $\omega_\nu = 0$ agree with those of the comparable
solutions of Spergel et al. [7] and Tegmark et al. [8] within the one sigma error (see Table II),
verifying that our minimisation to find the global minimum works at least as well as the
MCMC method they used. In fact, the overall $\chi^2$ we attained is appreciably smaller than the
two authors' for the same set of input data ($\chi^2_{\rm Spergel} - \chi^2_{\min} = 2.4$, and
$\chi^2_{\rm Tegmark} - \chi^2_{\min} = 2.9$). We may ascribe this to a finer grid of the parameters
close to the minimum in our work. We find a bimodal structure of the $\chi^2$ surface, most clearly
visible for $n_s$ and $\tau$, which are strongly correlated with each other. The two minima are
found at $n_s = 0.973$ and $n_s = 1.21$, with the second minimum having a slightly larger $\chi^2$,
$\chi^2(n_s = 1.21) - \chi^2(n_s = 0.973) = 0.2$, or a relative likelihood of 1.1: see Table III.
The two parameter sets are separated by a hill with a height of more than one $\sigma$. The Vegas
integration over multiparameter space centred on the two extrema indicates that the former minimum
is favoured over the latter by the ratio of 1.3 in terms of the likelihood value. That is, the
likelihood from the $\chi^2$ estimator is a good approximation to the 'true' value obtained by
marginalising the parameters, i.e., even in this case where the distribution deviates from
Gaussian the $\chi^2$ function is likely a reasonable approximation of the likelihood function.
Furthermore, we observed that the one-parameter distributions with respect to the other five
parameters are close to Gaussian once we require $n_s$ to be around the peak at $n_s = 0.973$ (see
Appendix B). This suggests that the distribution in multidimensional space is likely not far from
Gaussian. Hence, we infer that the $\chi^2$ statistics approximates the reality well.
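The relation between a $\chi^2$ difference and a likelihood ratio used in this comparison can be written out explicitly. The sketch below assumes only $\mathcal{L} \propto \exp(-\chi^2/2)$; the function names are illustrative.

```python
import math


def relative_likelihood(delta_chi2: float) -> float:
    """Ratio L_best / L_other for chi2_other - chi2_best = delta_chi2,
    assuming L is proportional to exp(-chi2 / 2)."""
    return math.exp(0.5 * delta_chi2)


def equivalent_delta_chi2(likelihood_ratio: float) -> float:
    """Delta chi^2 that corresponds to a given likelihood ratio."""
    return 2.0 * math.log(likelihood_ratio)


if __name__ == "__main__":
    # Delta chi^2 = 0.2 between the two minima quoted in the text
    print(round(relative_likelihood(0.2), 3))
    # Effective Delta chi^2 implied by the Vegas likelihood ratio of 1.3
    print(round(equivalent_delta_chi2(1.3), 3))
```

With $\Delta\chi^2 = 0.2$ the ratio is about 1.1, as quoted; the Vegas ratio of 1.3 corresponds to an effective $\Delta\chi^2$ of about 0.5, so the two estimates differ by only a few tenths in $\chi^2$.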

The bimodal structure we find is consistent with what was found by Tegmark et al.,
but our likelihood of the second minimum is much higher than that reported (the ratio of
likelihoods between the two extrema by Tegmark et al. is 2.5). We suspect that the Markov
chain of Tegmark et al. does not sample well around the second minimum. This point is
demonstrated in more detail in Appendix B. This is an example in which the current application
of the MCMC does not give an accurate likelihood function away from the global minimum.
Of course, the second solution is an unphysical one in the sense that it is allowed only at
the cost of an unacceptably high reionisation optical depth ($\tau \simeq 0.5$); the solution is
eliminated by a suitable prior on $\tau$. The resulting parameters $\omega_b$ and $h$ also deviate
significantly from the values derived from other observations.

In Figure 1 we observe that $\chi^2_{\min}(\omega_\nu)$ increases with the neutrino mass density.
The curve of the minimum $\chi^2$ is close to a parabola except in the immediate vicinity of
$\omega_\nu = 0$. Taking $\Delta\chi^2 = \chi^2_{\min}(\omega_\nu) - \chi^2_{\min} = 4$ to set an upper
limit on $\omega_\nu$ at the 95% confidence level, we obtain

$$\omega_\nu < 0.024, \quad \text{or} \quad m_\nu < 0.75\ \mathrm{eV}. \qquad (2)$$

Since the likelihood function with respect to $n_s$ and $\omega_\nu$,
$\mathcal{L} = \exp[-\Delta\chi^2(n_s, \omega_\nu)/2]$, which is constructed by minimising over the
five other parameters, visibly deviates from Gaussian, we integrated it over $n_s$ and then over
$\omega_\nu$. This yields the 95% confidence limit

$$\omega_\nu < 0.021, \quad \text{or} \quad m_\nu < 0.66\ \mathrm{eV}, \qquad (3)$$

which is close to Eq. (2), a simple reading from $\chi^2$. [The difference primarily comes from the
second peak of the $\chi^2$ function, which is ignored in Eq. (2).] If the distributions of the five
other parameters are close to Gaussian, a two-dimensional integral is sufficient to obtain an
accurate likelihood.
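How an integrated 95% limit compares with a simple $\Delta\chi^2 = 4$ reading can be illustrated with a one-dimensional toy calculation. The sketch below is not the two-dimensional integral over $n_s$ and $\omega_\nu$ performed for Eq. (3): it assumes a purely parabolic $\chi^2$ curve with $\Delta\chi^2 = 4$ at $\omega_\nu = 0.024$, a flat prior on $\omega_\nu \ge 0$, and trapezoidal integration. The names `upper_limit` and `toy_chi2` are hypothetical.

```python
import math


def upper_limit(chi2_of, x_max, cl=0.95, n=20000):
    """Value x_cl such that the integral of exp(-chi2/2) over [0, x_cl]
    is a fraction `cl` of the integral over [0, x_max] (flat prior x >= 0)."""
    dx = x_max / n
    xs = [i * dx for i in range(n + 1)]
    chi2 = [chi2_of(x) for x in xs]
    chi2_min = min(chi2)
    like = [math.exp(-0.5 * (c - chi2_min)) for c in chi2]

    # cumulative trapezoidal integral of the likelihood
    cum = [0.0]
    for i in range(n):
        cum.append(cum[-1] + 0.5 * (like[i] + like[i + 1]) * dx)

    target = cl * cum[-1]
    for i in range(1, n + 1):
        if cum[i] >= target:
            # linear interpolation inside the step that crosses the target
            frac = (target - cum[i - 1]) / (cum[i] - cum[i - 1])
            return xs[i - 1] + frac * dx
    return x_max


def toy_chi2(omega_nu, sigma=0.012):
    """Toy parabola with Delta chi^2 = 4 at omega_nu = 2 * sigma = 0.024."""
    return (omega_nu / sigma) ** 2


if __name__ == "__main__":
    print(upper_limit(toy_chi2, x_max=0.1))
```

For this toy parabola the integrated limit is about 0.0235, close to the $\Delta\chi^2 = 4$ reading of 0.024; a difference as large as that between Eqs. (2) and (3) therefore has to come from non-parabolic structure in the actual likelihood.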

We cannot compare this limit on the neutrino mass directly with those derived in Spergel
et al. [7] and Tegmark et al. [8], in which those authors used the galaxy clustering data as
additional inputs. On the other hand, the latter authors claim that WMAP alone does not
give a limit on the neutrino mass and that massive neutrinos can make up 100% of the
dark matter at about one $\sigma$ confidence unless galaxy clustering data are used. Our result
contradicts this. We do not find a parameter set that gives an acceptable $\chi^2$ for a neutrino
mass density beyond the limit. Furthermore, the measure of the parameter space does
not seem to increase for a larger $\omega_\nu$. Our Vegas integrals give a relative likelihood between
$\omega_\nu = 0$ and $\omega_\nu = 0.08$ of $7 \times 10^{-\ldots}$, which is consistent with the estimate
from our $\chi^2$ curve, $5 \times 10^{-\ldots}$, whereas Tegmark et al.'s value is 0.6. We suspect that
the sampling of the Markov chain of Tegmark et al. does not give an accurate likelihood function
away from $m_\nu = 0$, which is the global minimum, as similarly happened in the case of $n_s$
discussed above and in Appendix B. In particular, we do not find a mixed-dark-matter-model-like
($\Omega_m + \Omega_\nu = 1$) solution: the CMB multipoles of the hot dark matter model with some
sets of parameters are visibly similar to the observation [9], but a closer inspection shows that
$\chi^2$ is always unacceptably large, given the high accuracy of the WMAP data^2. In the following
section we see the reason why one can obtain the limit on the neutrino mass density from the CMB
data alone.

We remark that the current WMAP TE data do not seem to play a significant role in
deriving our limit, as we find in separate runs of the $\chi^2$ minimisation using only the TT
data^3: the $\chi^2$ curves differ little between the two cases. This somewhat differs from the
forecast of Eisenstein et al. [...], who indicated a tighter error allowance that would result
with the WMAP polarisation data^4.

As a final remark, the two minima found for $\omega_\nu = 0$ persist up to $\omega_\nu \approx 0.04$,
but the one that corresponds to the "unphysical solution" disappears for $\omega_\nu > 0.05$.

III. THE REDUCED CMB OBSERVABLES AND THE NEUTRINO MASS 

A. The reduced CMB observables and the goodness of the $\Lambda$CDM fit

Following Ref. [15] we focus on four quantities which characterise the shape of the CMB
spectrum: the position of the first peak $\ell_1$, the height of the first peak relative to the large
angular-scale amplitude evaluated at $\ell = 10$,

$$H_1 \equiv \left(\frac{\Delta T_{\ell_1}}{\Delta T_{10}}\right)^2, \qquad (4)$$

the ratio of the second peak height to the first

$$H_2 \equiv \left(\frac{\Delta T_{\ell_2}}{\Delta T_{\ell_1}}\right)^2, \qquad (5)$$

and the ratio of the third peak height to the first,

$$H_3 \equiv \left(\frac{\Delta T_{\ell_3}}{\Delta T_{\ell_1}}\right)^2, \qquad (6)$$

where $(\Delta T_\ell)^2 = \ell(\ell+1)C_\ell^{TT}/2\pi$, $C_\ell^{TT}$ is the multipole coefficient of
the temperature anisotropy, and $\ell_2$ and $\ell_3$ are the positions of the second and third peaks.

^2 For the set of parameters of a mixed-dark-matter-model-like solution proposed by Elgarøy &
Lahav [9], we find $\chi^2 = 1482$, which is larger than that of the $\Lambda$CDM solution by
$\Delta\chi^2 \approx 50$. We cannot make $\chi^2$ significantly smaller around this solution.

^3 We somewhat loosened the convergence criteria for these runs, but we still obtained
$\chi^2_{\min} = 972.3$ compared with 972.4 of Tegmark et al. The solution differs appreciably from
that with the full data set only in $\tau$, which for the TT case is close to zero.

^4 Their forecast 2 $\sigma$ errors are 1.2 eV with the polarisation data, and 1.8 eV without them,
for a hypothetical neutrino mass of 0.7 eV, assuming idealised CMB data with Gaussian variance
around the prediction of the $\Lambda$CDM model. This does not contradict our actual limit.

Taking advantage of the one million CMB templates we generated, we estimate the
reduced CMB observables from the envelope drawn by the entire set of the templates. Our
sampling is dense enough to define the correct envelope at least for the small $\Delta\chi^2$ that
concerns us. The result is

$$\ell_1 = 220^{+\ldots}_{-\ldots}, \qquad (7)$$
$$H_1 = 6.7^{+\ldots}_{-\ldots}, \qquad (8)$$
$$H_2 = 0.449 \pm 0.007, \qquad (9)$$
$$H_3 = 0.461^{+\ldots}_{-\ldots}, \qquad (10)$$

which is shown in Figure 4. The error is 1 standard deviation obtained by halving the range
that gives the 2 $\sigma$ error, i.e., $\Delta\chi^2 = \chi^2 - \chi^2_{\min} = 4$, because the structure
of the $\chi^2$ curve is not always parabolic around $\Delta\chi^2 \approx 1$. The central values are
the best fit solution given in Table I. Eqs. (7)-(10) are consistent with the values Tegmark et
al. [8] quoted for their best parameter set ($H_1$ is not given). We note particularly small errors
for $\ell_1$ and $H_2$, which play an important role in the argument given in the next subsection. In
addition, we draw the envelopes for the case of a few non-zero neutrino masses. They give
increasingly larger $\chi^2$ as the mass increases, in particular for $\omega_\nu > 0.02$; the widths of
the valleys become somewhat narrower as $\omega_\nu$ increases.
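The envelope construction and the halving of the $\Delta\chi^2 = 4$ range can be sketched as follows. This is only an illustration under stated assumptions: each template is reduced to a pair (observable value, $\chi^2$), the envelope is taken as the minimum $\chi^2$ in bins of the observable, and the bin width is a free choice. The function names are hypothetical.

```python
import math


def lower_envelope(samples, bin_width):
    """Minimum chi^2 in each bin of the observable.

    samples: iterable of (observable_value, chi2) pairs, one per template.
    Returns a list of (bin_centre, min_chi2) sorted by bin centre.
    """
    best = {}
    for value, chi2 in samples:
        k = math.floor(value / bin_width)
        if k not in best or chi2 < best[k]:
            best[k] = chi2
    return [((k + 0.5) * bin_width, best[k]) for k in sorted(best)]


def one_sigma_from_two_sigma(envelope, delta=4.0):
    """Central value and 1-sigma errors obtained by halving the range
    in which chi2 - chi2_min <= delta (delta = 4 for the 2-sigma range).

    Returns (centre, error_minus, error_plus).
    """
    centre, chi2_min = min(envelope, key=lambda p: p[1])
    inside = [x for x, c in envelope if c - chi2_min <= delta]
    lo, hi = min(inside), max(inside)
    return centre, 0.5 * (centre - lo), 0.5 * (hi - centre)


if __name__ == "__main__":
    # Toy templates: a parabolic valley around 220 plus excess chi^2 values
    toy = [(215.25 + 0.5 * i, (215.25 + 0.5 * i - 220.0) ** 2 + extra)
           for i in range(20) for extra in (0.0, 1.0, 3.0)]
    env = lower_envelope(toy, bin_width=0.5)
    print(one_sigma_from_two_sigma(env))
```

Because the errors are read from the envelope on each side separately, the result is in general asymmetric about the best-fit value.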

We also attempt to obtain the four reduced CMB observables from the fits that give a
$\chi^2$ minimum for a restricted range of $\ell$ using our CMB templates, as was done in [15]. We
calculate $\chi^2$ using the TT data of appropriate multipole ranges. We use $75 < \ell < 375$ for
$\ell_1$, $7 < \ell < 375$ for $H_1$, $75 < \ell < 375$ and $450 < \ell < 600$ for $H_2$, and
$75 < \ell < 375$ and $750 < \ell < 875$ for $H_3$. The results are displayed in Figure 4. We obtain

$$\ell_1 = [\,219,\ 222\,], \qquad (11)$$

$$H_1 = [\,6.5,\ 7.9\,], \qquad (12)$$

$$H_2 = [\,0.430,\ 0.452\,], \qquad (13)$$

$$H_3 = [\,0.362,\ 0.488\,]. \qquad (14)$$
The numbers bracketed are the 1 $\sigma$ range obtained by halving the 2 $\sigma$ range of the
$\chi^2_{\rm local}$ curve^5. An inspection of the fits of the templates to the observed CMB
multipoles indicates that the data are well represented by those templates. Figure 4 may give the
impression that the $\chi^2$ curves do not agree with those obtained from the envelope of the global
$\Lambda$CDM fit: the valley of the curves is generally wider, and the position of the bottom of the
valley is somewhat shifted; the $\chi^2_{\rm local}$ of the best global fit solution is larger than
that of the best local fits by $\Delta\chi^2 \approx 2$. The central values of Eqs. (7) to (10),
however, fall in the one $\sigma$ range of Eqs. (11) to (14)^6. Our analysis, showing that the best
global fit and the local fits result in consistent reduced CMB parameters within 1 $\sigma$, leads
us to conclude that the $\Lambda$CDM fit is good.

For the consideration given in the next subsection, where we are concerned with the
problem of how much massive neutrinos increase $\chi^2$ for the CMB data relative to the
$\omega_\nu = 0$ solution, we should use Eqs. (11)-(14), rather than Eqs. (7)-(10), which are
obtained by restricted parameter searches.



B. Reduced CMB observables and the neutrino mass 



We calculate the response of the observables $O_i = \ell_1$, $H_1$, $H_2$ and $H_3$ with respect to
the variation of cosmological parameters $x_j$, i.e., the partial derivatives
$\partial O_i/\partial x_j$, around the global best fit, following Ref. [15]. We vary the parameters
typically by ±50% with a step of 5% and take the difference from the reference values. We find that
the responses are quite linear against the amount of the variations of the 6 parameters. The
exception is the response to the neutrino density parameter, which is shown separately. The
resulting partial derivatives are:

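$$\Delta\ell_1 = 16\,\frac{\Delta\omega_b}{\omega_b} - 25\,\frac{\Delta\omega_m}{\omega_m} - 47\,\frac{\Delta h}{h} + 36\,\frac{\Delta n_s}{n_s} + f_{\Delta\ell_1}(\omega_\nu), \qquad (15)$$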



^5 The 1 $\sigma$ range of $H_1$ depends on the choice of the lower limit of the $\ell$-range. It is
well known that the $\ell = 2$ and 3 multipoles are anomalously low compared to the expectation from
the $\Lambda$CDM model. If the lower limit is set to $\ell_{\min} = 2$, the one $\sigma$ range will be
$H_1 = [7.0, 8.0]$. The 1 $\sigma$ range nearly converges for $\ell_{\min} > 3$: the central value
does not differ from Eq. (12) by more than 0.1.

^6 The parameters derived by Page et al. [22], who extracted them by fitting the WMAP data with
Gaussian and parabolic functions, $\ell_1 = 220.1 \pm 0.8$, $H_2 = 0.426 \pm 0.015$, and
$H_3 = 0.42 \pm 0.08$ ($H_1$ is not given), deviate from our $\Lambda$CDM solutions in Eqs. (7) to
(10) by up to 1.5 $\sigma$, but agree with those given in Eqs. (11) to (14).
FIG. 4: Constraints on the four reduced CMB observables. The local $\chi^2$ is computed using
restricted sets of multipoles as explained in the text and is measured with $\chi^2_{\rm local}$ in
the relevant range indicated on the left ordinate. The $\chi^2$ of the global solution is measured
with respect to the entire data set, as indicated on the right ordinate. The relative normalisation
is fixed so that the global solution that gives the minimum $\chi^2$ gives the local $\chi^2$ value
measured on the left ordinate. Dotted curves are the envelopes for $\omega_\nu = 0.01$, 0.02 and 0.03
in the order of increasing minimum $\chi^2$. The horizontal dashed line segments show the position
of $\Delta\chi^2 = \chi^2 - \chi^2_{\min} = 4$.
Here $\Omega_{\rm tot} = 1$ is kept fixed, and $f_{\Delta O_i}(\omega_\nu)$ stands for the variation
with respect to the neutrino mass density. The responses of $H_2$ and $H_3$ to $h$, and those of
$\ell_1$, $H_2$ and $H_3$ to $\tau$, are small, so they are omitted in the expressions. Page et
al. [22] evaluated the partial derivatives for $H_2$ and $H_3$ to the variations of $\omega_b$,
$\omega_m$ and $n_s$ for the WMAP data using the analytic expressions of Ref. [15]. Our empirical
derivatives for these quantities are consistent with their analytic evaluation.
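A linear response relation of the kind in Eq. (15) can be evaluated directly. The sketch below uses the coefficient magnitudes 16, 25, 47 and 36 of Eq. (15); the signs adopted for the $\omega_b$, $\omega_m$ and $h$ terms are an assumption of this sketch, chosen so that a decrease of $h$ raises $\ell_1$, which is the behaviour the text relies on. The dictionary and function names are hypothetical.

```python
# Linearised response of the first peak position, Delta ell_1, to fractional
# changes of the cosmological parameters. Coefficient magnitudes follow
# Eq. (15); the signs of the omega_b, omega_m and h terms are assumed here.
ELL1_COEFF = {
    "omega_b": 16.0,
    "omega_m": -25.0,
    "h": -47.0,
    "n_s": 36.0,
}


def delta_ell1(frac_changes, neutrino_term=0.0):
    """Shift of ell_1 for fractional changes {name: Delta x / x}.

    neutrino_term is the separately tabulated response to omega_nu,
    which is not linear and must be supplied by the caller.
    """
    shift = sum(ELL1_COEFF[name] * frac for name, frac in frac_changes.items())
    return shift + neutrino_term


if __name__ == "__main__":
    # Lowering h from 0.69 to 0.58 at fixed omega_b, omega_m and n_s
    frac_h = (0.58 - 0.69) / 0.69
    print(round(delta_ell1({"h": frac_h}), 2))
```

Under these assumed signs, lowering $h$ from 0.69 to 0.58 shifts $\ell_1$ upward by about 7.5, which is the size of shift needed to offset a comparable downward shift from another source.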

We draw the response of $O_i$ against the variation of $\omega_\nu$ for the range $\omega_\nu = 0$ to
0.04 in Figure 5. Note that an increase in $\omega_\nu$ accompanies a decrease in $\Omega_\Lambda$ as
we keep $\Omega_{\rm tot} = 1$ and $\omega_m = \omega_{\rm cdm} + \omega_b$ fixed^7. We see small
glitches from $\omega_\nu = 0$ to the neighbouring point in Figures 5 (b), (c) and (d). This is
probably a numerical artefact caused by the implementation of the massive neutrino subroutine in
the CMBFAST code, and we ignore these glitches since they are much smaller than the errors of the
CMB data.

^7 Which variables are to be used is merely a matter of convention. We chose the ones with which
the effect of massive neutrinos is more clearly visible.
We observe that the response of the four observables against $\omega_\nu$ changes at around
$\omega_\nu \approx 0.015$. As $\omega_\nu$ increases beyond it, the decrease of $\ell_1$ becomes
gentle; $H_1$, which increases with $\omega_\nu$ up to $\omega_\nu = 0.017$, turns to decrease. $H_2$
and $H_3$ change little (< 0.5%) between $\omega_\nu = 0$ and 0.015, but then begin to increase. We
can understand this turning point as the competition between neutrino free streaming and
recombination. Neutrinos become non-relativistic before recombination when $\omega_\nu > 0.017$ and
they become non-relativistic after recombination when $\omega_\nu < 0.017$. We can show that the
behaviours, at least for $\ell_1$ and $H_1$, are quantitatively understood by simple analytical
considerations, but let us defer this problem to the next section.
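The threshold $\omega_\nu \approx 0.017$ can be checked with a short estimate. The sketch below assumes, as in the analytic discussion of Sec. IV, that a neutrino becomes non-relativistic when $T_\nu = m_\nu/3$ and that recombination occurs at $z_{\rm rec} = 1088$. The present CMB temperature of 2.725 K, the standard relation $T_\nu = (4/11)^{1/3} T_\gamma$ and the function names are inputs of this sketch rather than values quoted in the text.

```python
# Redshift at which a neutrino of mass m_nu becomes non-relativistic,
# compared with the recombination redshift.

T_CMB_K = 2.725                # present CMB temperature in kelvin (assumed)
K_B_EV_PER_K = 8.617333e-5     # Boltzmann constant in eV per kelvin
# Present neutrino temperature in eV, using T_nu = (4/11)^(1/3) T_gamma
T_NU0_EV = (4.0 / 11.0) ** (1.0 / 3.0) * T_CMB_K * K_B_EV_PER_K

Z_REC = 1088.0                 # recombination redshift quoted in Sec. IV
EV_PER_UNIT_OMEGA = 94.1       # Eq. (1): omega_nu = sum(m_nu) / 94.1 eV
N_SPECIES = 3                  # three degenerate species


def one_plus_z_nr(m_nu_ev: float) -> float:
    """1 + z_nr, with the transition defined by T_nu = m_nu / 3."""
    return m_nu_ev / (3.0 * T_NU0_EV)


def omega_nu_threshold(z_rec: float = Z_REC) -> float:
    """omega_nu for which z_nr equals z_rec (degenerate masses)."""
    m_nu = 3.0 * T_NU0_EV * (1.0 + z_rec)
    return N_SPECIES * m_nu / EV_PER_UNIT_OMEGA


if __name__ == "__main__":
    print(f"1 + z_nr for 1 eV: {one_plus_z_nr(1.0):.0f}")
    print(f"threshold omega_nu: {omega_nu_threshold():.4f}")
```

The estimate gives $1 + z_{\rm nr} \approx 2.0 \times 10^3\,(m_\nu/{\rm eV})$ and a threshold of $\omega_\nu \approx 0.017$, i.e. about 0.55 eV per species.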

Here, we are concerned with the problem of how the constraint on the neutrino mass is
obtained from the CMB data alone, given observational and empirical information on $\ell_1$,
$H_1$, $H_2$ and $H_3$. We argue that we cannot derive a constraint for $\omega_\nu < 0.017$ but an
upper limit likely exists at some neutrino mass in the region $\omega_\nu > 0.02$.

We first consider $\omega_\nu < 0.017$. In this regime, as seen in Figure 5, $\ell_1$ decreases and
$H_1$ increases while $H_2$ and $H_3$ change little with increasing $\omega_\nu$. The change induced
by $\omega_\nu$ in $\ell_1$ is significant, but according to Eq. (15) it is cancelled to a large
degree by a decrease of $h$ [Fig. 2 (c)], as seen in Figure 3 (a). For $\omega_\nu = 0.015$, say, we
need $h$ to decrease from 0.69 to 0.58, but this change is harmless. The decrease of $h$, however,
causes an increase of $H_1$ (see Eq. (16)) in addition to the direct increase due to $\omega_\nu$.
The increase of $H_1$ is cancelled by decreases of $n_s$ and $\omega_b$, whereas those two decreases
tend to cancel in $H_2$ and $H_3$. The error allowance of $H_1$ is large enough that a good
cancellation is not required, and hence it is easy to make the induced changes of $H_2$ and $H_3$
cancel to within their error allowances. The large error of $H_1$ arises primarily from the cosmic
scatter, $\sigma \sim \sqrt{2/(2\ell+1)}$, in small $\ell$ modes (which we estimate to give
$\delta H_1 \approx \pm 0.5$); so it seems unlikely that it will be reduced greatly in data expected
in the future. Therefore, unless external observational data are introduced we cannot derive a
constraint on the neutrino mass density for $\omega_\nu$ substantially smaller than 0.017, consistent
with the flat $\chi^2$ dependence around $\omega_\nu = 0$ in Figure 1. This will remain true even if
the quality of the CMB data is improved.
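The size of the cosmic scatter at low multipoles follows directly from $\sigma \sim \sqrt{2/(2\ell+1)}$. The short sketch below evaluates it per multipole and, as a further assumption of this sketch, for an equal-weight average over a band of multipoles; it is not the estimate that leads to $\delta H_1 \approx \pm 0.5$, whose details are not given here. The function names are hypothetical.

```python
import math


def cosmic_variance_fraction(ell: int) -> float:
    """Fractional cosmic scatter of C_ell for a full-sky measurement."""
    return math.sqrt(2.0 / (2 * ell + 1))


def band_fraction(ell_min: int, ell_max: int) -> float:
    """Fractional scatter of an equal-weight band average over
    ell_min..ell_max, counting the (2 ell + 1) independent modes per ell."""
    n_modes = sum(2 * ell + 1 for ell in range(ell_min, ell_max + 1))
    return math.sqrt(2.0 / n_modes)


if __name__ == "__main__":
    for ell in (2, 10, 220):
        print(ell, round(cosmic_variance_fraction(ell), 3))
```

At $\ell = 10$ the scatter of a single multipole is about 31%, against about 7% at $\ell = 220$, which is why the normalisation at low $\ell$, and hence $H_1$, cannot be sharpened much by better instruments.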

When $\omega_\nu > 0.017$, massive neutrinos contribute to increase $H_2$ and $H_3$, as seen in
Figures 5 (c) and (d), in addition to a further decrease of $\ell_1$. Looking at Eq. (17) and
Eq. (18), the increase in $H_2$ and $H_3$ due to massive neutrinos may be compensated by either
increasing $\omega_b$ or decreasing $n_s$. Actually, as shown in Figure 2 (a) and (e), the decrease
of $n_s$ occurs to minimise $\chi^2$. This is owing to a steeper increase of $H_3$ than $H_2$ with
the increase of $\omega_\nu$: $\Delta H_3/\Delta\omega_\nu > \Delta H_2/\Delta\omega_\nu$ in Figures
5 (c) and (d). Such increases are more efficiently compensated by the decrease of $n_s$ than by the
increase of $\omega_b$, as read from Eqs. (17) and (18), which indicate that
$\Delta H_3/\Delta n_s > \Delta H_2/\Delta n_s$ whereas
$|\Delta H_3/\Delta\omega_b| < |\Delta H_2/\Delta\omega_b|$. [NB:
$\omega_b(\omega_\nu = 0.017) - \omega_b(\omega_\nu = 0)$ is negative for the reason discussed in the
above paragraph, but $\omega_b$ turns to increasing for $\omega_\nu > 0.017$ to collaborate in the
requirement.] In other words, massive neutrinos enhance the multipoles more on smaller scales
(larger $\ell$), which causes an effect more similar to an increase of $n_s$ than to a decrease of
$\omega_b$, the latter increasing the even peaks more strongly.

In passing, it is worth noting that $\omega_\nu$ and $n_s$ are negatively correlated in this
argument, in contrast to the naive expectation of the positive correlation from the effect of
massive neutrinos that diminish the small scale power. The latter implies that the limit on the
neutrino mass loosens for increasing $n_s$ (e.g., [...]). The CMB argument works in the opposite
way.

The cancellation of the effect due to $\omega_\nu$ in the acoustic peaks by decreasing $n_s$
increases the large-scale amplitude significantly, as is manifest in the large coefficient of
$\Delta n_s/n_s$ in Eq. (16). With a tight error allowance for $H_2$, the decrease of $n_s$ compels
$H_1$ to decrease largely, as seen in Figure 3 (b), and pushes $H_1$ below the allowed error range
($H_1 > 6.2$ at 1.5 $\sigma$) at around $\omega_\nu \approx 0.02$, while $H_2$ and $H_3$ stay within
the boundary of errors given in Eqs. (13) and (14). This corresponds to the upper limit on
$\omega_\nu$ we obtained, i.e., $\omega_\nu < 0.021$ (at 95%), in the numerical study of the
$\chi^2$ test.

Let us visit briefly the possibility of varying $\tau$ to increase $H_1$. From Eq. (16), a large
decrease of $\tau$ would make it increase without disrupting $\ell_1$, $H_2$ and $H_3$. However,
$\tau$ cannot be reduced as much as one wants, as displayed in Figure 2 (d). The observed high
amplitude at the lowest multipoles of the TE mode needs a non-negligible amount of the reionisation
optical depth.

We may also ask whether the inclusion of the tensor perturbations changes the limit. Hu
et al. [15] give

$$\Delta H_1 \approx -5\,\frac{r_t}{1 + 0.76\,r_t}, \qquad (19)$$

where $r_t = 1.4\,[\Delta T_{10}^{(T)}/\Delta T_{10}^{(S)}]^2$ is the tensor to scalar ratio at
$\ell = 10$. This means that the inclusion of the tensor mode collaborates to lower $H_1$, and thus
only tightens the limit on the neutrino mass density.

These considerations show that one can derive a limit on the neutrino mass density of the order of $\omega_\nu \sim 0.02$ from WMAP data alone. They also show that the limit may be improved to $\omega_\nu \sim 0.017$ with the use of improved CMB data, but it would not be easy to go beyond that. Even with the extremely high precision data anticipated from PLANCK, the limit we expect will be $\omega_\nu < 0.013$ at the 95% confidence level^8: the increase of $\chi^2$ is very slow for $\omega_\nu < 0.01$^9.

The efficient way to improve the limit on $\omega_\nu$ is to introduce observations that constrain the Hubble constant, either directly or indirectly, from below. This is because the most prominent effect caused by light neutrinos is to change the position of the first peak, and this change is absorbed into a lowering shift of the Hubble constant. Should one require that $H_0 > 65$ km s$^{-1}$ Mpc$^{-1}$, a significantly stronger limit of the order of $\omega_\nu < 0.01$ would be derived even with the current CMB data^10.



IV. ANALYTIC CONSIDERATIONS ON THE EFFECT OF MASSIVE NEUTRINOS

A. The position of the first peak 

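Here, we attempt to understand the effect of massive neutrinos on the reduced CMB observables. We may take the epoch when a neutrino of mass $m_\nu$ becomes nonrelativistic as the time when its momentum is $\sim m_\nu$, i.e., $T_{\nu,{\rm nr}} = m_\nu/3$. The corresponding redshift is

$$1 + z_{\rm nr} = \frac{m_\nu}{3\,T_{\nu,0}} \tag{20}$$
$$\qquad\quad = 1.99\times 10^{3}\,(m_\nu/{\rm eV}) \tag{21}$$
$$\qquad\quad = 6.24\times 10^{4}\,\omega_\nu, \tag{22}$$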



where $\sum m_\nu = 3m_\nu$ is used for the last equality. This is compared with the redshift at recombination, $z_{\rm rec} = 1088$ [7], which is insensitive to the mass of neutrinos. Neutrinos become nonrelativistic before recombination, i.e., $z_{\rm nr} > z_{\rm rec}$, if

$$\omega_\nu > 0.017, \tag{23}$$

but otherwise they remain relativistic and stream freely until after recombination. This $\omega_\nu$ corresponds approximately to the turning points of the curves of $\ell_1$, $H_1$, $H_2$ and $H_3$ observed in Figure 5.

^8 In this estimate we use assumed CMB data that lie around the best $\Lambda$CDM solution for vanishing neutrino mass, with the error being the cosmic variance. We used our database to search for the $\chi^2$ minimum.

^9 Kaplinghat et al. [23] proposed to use the deflection angle power spectrum from weak gravitational lensing to give a stronger constraint on $m_\nu$. We do not take this into account in the present consideration.

^10 With this lower limit on $H_0$, the global minimum is given by an unphysical solution with an unreasonably large reionisation optical depth. Our statement in the text excludes this possibility.
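As a quick numerical check of Eqs. (20)-(23), the following minimal Python sketch evaluates the redshift at which a neutrino becomes nonrelativistic and the threshold at which this happens at recombination. The function and constant names are illustrative only. The relation $T_{\nu,0} = (4/11)^{1/3}T_{\gamma,0}$ and the value of the Boltzmann constant are standard inputs assumed here, not quoted from the text.

```python
# Sketch: epoch at which neutrinos become nonrelativistic (Eqs. 20-23).
K_B_EV_PER_K = 8.617333e-5   # Boltzmann constant in eV/K (assumed standard value)
T_GAMMA0_K = 2.725           # present photon temperature in K (from the text)
Z_REC = 1088.0               # recombination redshift (from the text)


def one_plus_z_nr(m_nu_ev):
    """1 + z_nr for a neutrino of mass m_nu_ev, using T_nu,nr = m_nu / 3."""
    t_nu0_ev = (4.0 / 11.0) ** (1.0 / 3.0) * T_GAMMA0_K * K_B_EV_PER_K
    return m_nu_ev / (3.0 * t_nu0_ev)


def omega_nu_from_mass(m_nu_ev):
    """omega_nu for three degenerate species, from the coefficients of Eqs. (21)-(22)."""
    return 1.99e3 * m_nu_ev / 6.24e4


# Threshold z_nr = z_rec, expressed as omega_nu and as the summed mass.
omega_nu_threshold = (1.0 + Z_REC) / 6.24e4
sum_mass_threshold_ev = 3.0 * (1.0 + Z_REC) / 1.99e3

print(one_plus_z_nr(1.0))      # close to 1.99e3
print(omega_nu_threshold)      # close to 0.017
print(sum_mass_threshold_ev)   # roughly 1.6 eV
```

The same coefficients imply $\sum m_\nu \approx 94\,\omega_\nu$ eV, which is how $\omega_\nu < 0.021$ translates into $\sum m_\nu < 2.0$ eV.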

We denote the energy density in the form $\omega = \Omega h^2 = \rho h^2/\rho_{{\rm cr},0}$, where the critical density is $\rho_{{\rm cr},0} = 3M_{\rm pl}^2H_0^2$ with the Planck mass defined by the gravity scale $M_{\rm pl}^2 = 1/8\pi G$, and the subscript 0 expresses values at the present epoch. The matter and photon energy densities are

$$\frac{\rho_m(a)h^2}{\rho_{{\rm cr},0}} = \omega_{m,0}\left(\frac{a}{a_0}\right)^{-3}, \qquad \frac{\rho_\gamma(a)h^2}{\rho_{{\rm cr},0}} = \omega_{\gamma,0}\left(\frac{a}{a_0}\right)^{-4}, \tag{24}$$

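where the present photon energy density is $\omega_{\gamma,0} = 2.48\times10^{-5}$ for $T_{\gamma,0} = 2.725$ K [2]. The neutrino energy density is

$$\frac{\rho_\nu(a)h^2}{\rho_{{\rm cr},0}} = \frac{15}{\pi^4}\left(\frac{4}{11}\right)^{4/3} 3\,\omega_{\gamma,0}\left(\frac{a}{a_0}\right)^{-4}\int_0^\infty \sqrt{x^2+y^2}\;x^2\,(e^x+1)^{-1}\,dx, \tag{25}$$

where

$$y = m_\nu\,(11/4)^{1/3}\,(a/a_0)\,T_{\gamma,0}^{-1}, \tag{26}$$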

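and $x$ is the normalised momentum variable; three flavours of neutrinos are assumed to have a degenerate mass. The vacuum energy is

$$\frac{\rho_\Lambda(a)h^2}{\rho_{{\rm cr},0}} = \omega_{\Lambda,0} \tag{27}$$
$$\qquad\qquad = h^2 - \omega_{m,0} - \omega_{\nu,0}, \tag{28}$$

for the flat universe. The total energy density is $\rho_{\rm tot} = \rho_m + \rho_\gamma + \rho_\nu + \rho_\Lambda$. With $\rho_{\rm tot}$, the cosmic expansion rate $H = \dot a/a$ is given by $H^2 = \rho_{\rm tot}/3M_{\rm pl}^2$, which is used to evaluate the conformal time $\eta$,

$$\eta = \int\frac{dt}{a} = \int_0^a \frac{da'}{a'^2H(a')}. \tag{29}$$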
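The neutrino energy density integral can be checked numerically in its two limits. The sketch below assumes the standard Fermi-Dirac form of Eq. (25) with three degenerate species; the function names, the Simpson quadrature and the cutoff at $x = 60$ are choices made for this illustration, not the authors' code.

```python
import math

OMEGA_GAMMA0 = 2.48e-5               # present photon density parameter (from the text)
T_GAMMA0_EV = 2.725 * 8.617333e-5    # photon temperature in eV (k_B value assumed)


def fermi_integral(y, lower=0.0, upper=60.0, n=6000):
    """Simpson estimate of the integral of sqrt(x^2 + y^2) x^2 / (e^x + 1) dx."""
    step = (upper - lower) / n

    def integrand(x):
        return math.sqrt(x * x + y * y) * x * x / (math.exp(x) + 1.0)

    total = integrand(lower) + integrand(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(lower + i * step)
    return total * step / 3.0


def omega_nu_of_a(m_nu_ev, a, n_species=3):
    """rho_nu(a) h^2 / rho_cr,0 for degenerate neutrinos of mass m_nu_ev (a_0 = 1)."""
    y = m_nu_ev * (11.0 / 4.0) ** (1.0 / 3.0) * a / T_GAMMA0_EV
    prefactor = 15.0 / math.pi ** 4 * (4.0 / 11.0) ** (4.0 / 3.0) * n_species * OMEGA_GAMMA0
    return prefactor * a ** -4 * fermi_integral(y)


# Massless limit: should reproduce 3 * (7/8) * (4/11)^(4/3) * omega_gamma.
massless_today = omega_nu_of_a(0.0, 1.0)
# Nonrelativistic limit today for m_nu = 1 eV per species (sum = 3 eV).
heavy_today = omega_nu_of_a(1.0, 1.0)
print(massless_today, heavy_today, 3.0 / heavy_today)
```

The last printed number is the effective conversion factor between $\sum m_\nu$ (in eV) and $\omega_\nu$, which should come out near 94.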

The position of the $m$-th peak $\ell_m$ is determined from that of the acoustic scale $\ell_A$ and the phase shift $\phi_m$, which depends weakly on $m$,

$$\ell_m = \ell_A\,(m - \phi_m), \tag{30}$$

where the acoustic scale is defined by

$$\ell_A = \pi\,\frac{r_\theta(\eta_{\rm rec})}{r_s(\eta_{\rm rec})}, \tag{31}$$

with $r_s(\eta_{\rm rec})$ the sound horizon at the recombination epoch and $r_\theta(\eta_{\rm rec})$ the comoving angular diameter distance to the last scattering surface, $r_\theta(\eta_{\rm rec}) = \eta_0 - \eta_{\rm rec}$ in the flat universe. The sound horizon is given by

$$r_s(a) = \int_0^{\eta} c_s\,d\eta' = \int_0^a c_s(a')\,\frac{da'}{a'^2H(a')}, \tag{32}$$

where the sound speed is $c_s^2 = (1/3)(1+R)^{-1}$ with $R = 3\rho_b/4\rho_\gamma = 3a\,\omega_{b,0}/4\omega_{\gamma,0}$. The sound speed $c_s$ depends only on photons and baryons, and the effect of neutrino masses enters into the sound horizon only through the modification of the expansion law. The phase shift in Eq. (30) arises from the decay of the gravitational potential due to the suppression of growth by radiation when the universe is not fully matter dominated, which modifies the gravitational redshift that the photons would otherwise suffer from the Sachs-Wolfe effect [25]. This is sometimes called the early integrated Sachs-Wolfe effect. The evaluation of the integral gives $\ell_A \sim 300$, which is considerably larger than the physical position of $\ell_1$. The difference is ascribed to the phase shift $\phi_1$, which is estimated in what follows.
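To illustrate how $\ell_A \sim 300$ arises, here is a minimal Python sketch of Eqs. (29), (31) and (32) for the massless-neutrino case. The parameter values ($\omega_m = 0.14$, $h = 0.69$, $\omega_b = 0.022$) are taken as representative, three massless species are assumed for the radiation density, and the substitution $a = u^2$ and Simpson quadrature are numerical conveniences of this sketch rather than the authors' procedure.

```python
import math

C_OVER_H100_MPC = 2997.92458   # c / (100 km/s/Mpc) in Mpc
OMEGA_GAMMA0 = 2.48e-5         # from the text
N_NU = 3.0                     # massless species (assumption of this sketch)
OMEGA_R0 = OMEGA_GAMMA0 * (1.0 + N_NU * (7.0 / 8.0) * (4.0 / 11.0) ** (4.0 / 3.0))


def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule with an even number n of intervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * step)
    return total * step / 3.0


def acoustic_scale(omega_m=0.14, omega_b=0.022, h=0.69, z_rec=1088.0):
    """Return (ell_A, r_s, r_theta) for a flat universe, lengths in Mpc, a_0 = 1."""
    a_rec = 1.0 / (1.0 + z_rec)
    # Flatness; radiation is also subtracted here, a negligible correction.
    omega_lambda = h * h - omega_m - OMEGA_R0

    def a2_hubble(a):
        # a^2 H in units of 100 km/s/Mpc
        return math.sqrt(OMEGA_R0 + omega_m * a + omega_lambda * a ** 4)

    def d_eta(u):
        # integrand of Eq. (29) after substituting a = u^2
        return 2.0 * u / a2_hubble(u * u)

    def d_rs(u):
        # integrand of Eq. (32) after substituting a = u^2
        a = u * u
        baryon_loading = 3.0 * a * omega_b / (4.0 * OMEGA_GAMMA0)
        sound_speed = 1.0 / math.sqrt(3.0 * (1.0 + baryon_loading))
        return sound_speed * 2.0 * u / a2_hubble(a)

    u_rec = math.sqrt(a_rec)
    r_s = C_OVER_H100_MPC * simpson(d_rs, 0.0, u_rec)
    r_theta = C_OVER_H100_MPC * simpson(d_eta, u_rec, 1.0)
    return math.pi * r_theta / r_s, r_s, r_theta


ell_A, r_s, r_theta = acoustic_scale()
print(ell_A, r_s, r_theta)
```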

The enhancement of the amplitude for scales between the first acoustic peak and the horizon crossing at matter domination, due to the early integrated Sachs-Wolfe effect, makes the first peak form at a scale larger than the acoustic scale. An accurate evaluation of the phase shift requires the full solution of the coupled Boltzmann equations. Instead, we use the fitting formula given in Ref. [3],

$$\phi_1 \approx 0.267\left(\frac{r_{\rm rec}}{0.3}\right)^{0.1}, \tag{33}$$

where $r_{\rm rec}$ is the radiation-to-matter energy ratio $r = \rho_r/\rho_m$ at recombination. The appearance of the radiation-to-matter energy ratio as the prime variable is motivated by the physics of the integrated Sachs-Wolfe effect Q]. Precisely speaking, this fitting formula is given for massless neutrinos, but it is expected to be valid for the massive case provided that $r_{\rm rec}$ is modified appropriately, because the effect of massive neutrinos on the integrated Sachs-Wolfe effect is primarily through the change of $r_{\rm rec}$. A larger radiation-to-matter energy ratio leads to a larger enhancement and hence a larger phase shift, as indicated by Eq. (33). Massive neutrinos with $\omega_\nu > 0.017$ act in a way to suppress this effect.

The ratio $r_{\rm rec}$ in the presence of massive neutrinos is calculated as follows. We take neutrinos that have momentum larger than $m_\nu$ as radiation, and those having smaller momentum as matter. Accordingly, we split $\rho_\nu$ into the radiation component $\rho_{\nu,r}$ and the matter component $\rho_{\nu,m}$, as

$$\frac{\rho_{\nu,r}(a)h^2}{\rho_{{\rm cr},0}} = \frac{15}{\pi^4}\left(\frac{4}{11}\right)^{4/3} 3\,\omega_{\gamma,0}\left(\frac{a}{a_0}\right)^{-4}\int_y^\infty \sqrt{x^2+y^2}\;x^2\,(e^x+1)^{-1}\,dx, \tag{34}$$

$$\frac{\rho_{\nu,m}(a)h^2}{\rho_{{\rm cr},0}} = \frac{15}{\pi^4}\left(\frac{4}{11}\right)^{4/3} 3\,\omega_{\gamma,0}\left(\frac{a}{a_0}\right)^{-4}\int_0^y \sqrt{x^2+y^2}\;x^2\,(e^x+1)^{-1}\,dx, \tag{35}$$

by dividing the integration range at the value in Eq. (26). The radiation-to-matter energy ratio is calculated as

$$r(a) = \frac{\omega_{\gamma,0}\,(a/a_0)^{-4} + \rho_{\nu,r}(a)h^2/\rho_{{\rm cr},0}}{\omega_{m,0}\,(a/a_0)^{-3} + \rho_{\nu,m}(a)h^2/\rho_{{\rm cr},0}}, \tag{36}$$

which is used to compute $\phi_1$ in Eq. (33).

The first peak position thus calculated as a function of $\omega_\nu$ is shown in Figure 6, together with the curve from the full numerical computation presented earlier. The agreement is excellent, validating the prescription described here. For reference we also draw the case where the phase shift is fixed at the zero-neutrino-mass value, $(1 - \phi_1) \simeq 220/300$. This curve agrees with the accurate result for small neutrino masses, but starts deviating from $\omega_\nu \approx 0.015$, i.e., when neutrinos become nonrelativistic before the recombination epoch. The deviation represents the error of counting neutrinos as radiation even when they are nonrelativistic at recombination, which overestimates the early integrated Sachs-Wolfe effect and hence the phase shift $\phi_1$. This consideration demonstrates that the change of the slope in $\ell_1$ at $\omega_\nu \approx 0.017$ is a result of the reduction of the early integrated Sachs-Wolfe effect by the neutrinos that become nonrelativistic before the recombination epoch.
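The zero-neutrino-mass value $(1 - \phi_1) \simeq 220/300$ can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: the coefficient 0.267 is the value we take the fit of Ref. [3] to have (it is consistent with the ratio 220/300 quoted above), three massless species are assumed for the radiation density, and $\ell_A = 300$ and $\omega_m = 0.14$ are used as representative inputs.

```python
OMEGA_GAMMA0 = 2.48e-5   # present photon density parameter (from the text)


def r_rec_massless(omega_m, z_rec=1088.0, n_nu=3.0):
    """Radiation-to-matter ratio at recombination for massless neutrinos."""
    omega_r0 = OMEGA_GAMMA0 * (1.0 + n_nu * (7.0 / 8.0) * (4.0 / 11.0) ** (4.0 / 3.0))
    return omega_r0 * (1.0 + z_rec) / omega_m


def phase_shift(r_rec):
    """Fitting formula for phi_1; the coefficient 0.267 is assumed (see lead-in)."""
    return 0.267 * (r_rec / 0.3) ** 0.1


def first_peak(ell_a, r_rec):
    """ell_1 = ell_A (1 - phi_1), i.e. the m = 1 case of the peak-position relation."""
    return ell_a * (1.0 - phase_shift(r_rec))


r0 = r_rec_massless(0.14)
ell_1 = first_peak(300.0, r0)
print(r0, phase_shift(r0), ell_1)   # ell_1 comes out close to 220
```

Because the exponent is only 0.1, $\phi_1$ responds weakly to $r_{\rm rec}$; the visible change of slope in $\ell_1$ therefore requires the sizeable drop in $r_{\rm rec}$ that occurs once neutrinos count as matter at recombination.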

B. Heights of the acoustic peaks

It is known that free streaming of massive neutrinos causes a larger decay of the gravitational potential $\Phi$. This drives the acoustic oscillation of the baryon-photon fluid more strongly, so that the amplitude of temperature anisotropies within the free-streaming scale is enhanced through the monopole term $\Theta_0 + \Psi$ in the harmonic expansion of the temperature perturbations. The conformal time corresponding to the free-streaming scale is calculated as $\eta_{\rm nr} = \eta(a_{\rm nr})$, where $a_{\rm nr}$ is known from Eq. (22). This is the distance over which relativistic neutrinos move freely. The multipole $\ell_{\rm nr}$ corresponding to this scale is

$$\ell_{\rm nr} = \frac{2\pi\,r_\theta(\eta_{\rm rec})}{\eta_{\rm nr}}, \tag{37}$$

which we show in Figure 7 for $\omega_{m,0} = 0.14$, $h = 0.69$ and $z_{\rm rec} = 1088$. The multipole amplitudes for $\ell > \ell_{\rm nr}$ are affected by free streaming. For $\omega_\nu > 0.017$, the amplitude on the scale $\ell > 300$ is enhanced. This means that only the second and higher peaks receive the effect.

FIG. 6: Dependence of $\ell_1$ on $\omega_\nu$ calculated from Eq. (30) (dashed line), as compared with the accurate numerical solution (solid line). The dotted line shows the case when the effect of massive neutrinos on the early integrated Sachs-Wolfe effect is ignored.

The first peak is little affected by the decay of the gravitational potential, and the variation of $H_1$ with $\omega_\nu$ is understood by a simple consideration. The gentle increase of $H_1$ for $\omega_\nu < 0.017$ is understood by the decrease of $\Omega_\Lambda$ to compensate the neutrino energy density in the flat universe and an associated decrease of the integrated Sachs-Wolfe effect from the late domination of $\Lambda$, which enhances $C_{10}$. The effect continues to the region $\omega_\nu \gtrsim 0.017$, but in this regime massive neutrinos act as nonrelativistic dark matter at recombination and the effect from the increase of the amount of matter dominates; hence $H_1$ begins to decrease [$\Delta H_1/\Delta\omega_\nu < 0$, as seen in Eq. (16)]. This indicates that $\omega_\nu \sim 0.017$ is again the turning point, as we saw in Figure 5. In what follows we verify this reasoning by a more quantitative argument.

FIG. 7: Multipoles corresponding to the neutrino free-streaming scale.

Our strategy is to reduce the theory with massive neutrinos to an effective, mock theory without massive neutrinos, for which we have an established understanding [2^. If neutrinos are light they are taken as radiation, and if heavy, they are regarded as matter. For $\omega_\nu \sim 0.017$, they contribute as both matter and radiation, and are handled by splitting the neutrino energy density into the radiation and matter parts as in Eqs. (34) and (35). We count the matter part of neutrinos at recombination as additional "CDM". We then have the effective matter density,

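$$\omega_m^{\rm eff} = \omega_{m,0} + \frac{\rho_{\nu,m}(a_{\rm rec})\,h^2}{\rho_{{\rm cr},0}}\,a_{\rm rec}^3, \tag{38}$$

where $a_{\rm rec} = 1/1089$ (with $a_0 = 1$); see Figure 8 (a).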

In order to mimic the true matter-radiation equality epoch in the theory without massive neutrinos, we vary the effective number of neutrino species $N_\nu$. This ensures that nearly the same amount of the early integrated Sachs-Wolfe effect is generated in the massless neutrino world. The scale factor at equality, $a_{\rm eq}$, as a function of $\omega_\nu$ is calculated from the condition $r(a_{\rm eq}) = 1$, where $r$ is defined by Eq. (36). The result is shown in Figure 8 (b).

From the conventional calculation giving $1 + z_{\rm eq} = a_{\rm eq}^{-1} = 80950\,\omega_m/(2 + 0.454\,N_\nu)$ for $\omega_\nu = 0$, the effective $N_\nu$ we want is

$$N_\nu = \frac{80950\,\omega_m\,a_{\rm eq} - 2}{0.454}, \tag{39}$$

with $\omega_m$ taken as the effective matter density of Eq. (38), which is shown in Figure 8 (c).
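The effective number of species follows from inverting the equality relation quoted above. A minimal Python sketch, with hypothetical function names, shows the inversion and a round-trip check; in the mock theory the inputs would be the effective matter density and the $a_{\rm eq}$ obtained from the full massive-neutrino calculation.

```python
def one_plus_z_eq(omega_m, n_nu):
    """Conventional matter-radiation equality for massless neutrinos."""
    return 80950.0 * omega_m / (2.0 + 0.454 * n_nu)


def effective_n_nu(omega_m_eff, a_eq):
    """Invert 1/a_eq = 80950 omega_m / (2 + 0.454 N_nu) for N_nu."""
    return (80950.0 * omega_m_eff * a_eq - 2.0) / 0.454


# Round trip: three massless species must be recovered.
a_eq_three = 1.0 / one_plus_z_eq(0.14, 3.0)
print(effective_n_nu(0.14, a_eq_three))

# A later equality (larger a_eq) at fixed matter density maps onto more species.
print(effective_n_nu(0.14, 1.05 * a_eq_three))
```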

We also want to adjust $\Omega_\Lambda$ so that the integrated Sachs-Wolfe effect in the $\Lambda$-dominated epoch would be the same in the two universes. Noting that the CMB perturbation depends on $h$ in the form $\Omega h^2$, this may be accomplished by shifting $h$. Because the flat universe requires $(\omega_m + \omega_{\nu,0})h^{-2} + \Omega_\Lambda = 1$ for the massive neutrino case, and $\omega_m\tilde h^{-2} + \Omega_\Lambda = 1$ for the massless case, $h$ has to be reduced as

$$\tilde h = h\left(\frac{\omega_m}{\omega_m + \omega_{\nu,0}}\right)^{1/2}. \tag{40}$$



We consider that the massless neutrino theory with these parameter shifts captures the main features of the theory with massive neutrinos, at least for the first acoustic peak. In fact, as shown in Figure 9, this mock theory reproduces very well the full calculation of $H_1$ with massive neutrinos. In the same figure, we also show the two curves calculated by adjusting either $\omega_m$ and $N_\nu$ alone or $h$ alone, which represent, respectively, the effect of massive neutrinos as matter and the increase of the vacuum energy. The former curve is flat for $\omega_\nu < 0.017$ and turns down thereafter. These component curves demonstrate how $H_1$ is built.

The second and third peaks are enhanced by free streaming of massive neutrinos P|. Ignoring this effect, however, we plot $H_2$ and $H_3$ in Figure 10 for the mock theory we used to reproduce $H_1$. Obviously, they do not give the correct dependence for massive neutrinos, underestimating the true values of the changes in $H_2$ and $H_3$ for $\omega_\nu > 0.017$. The effect of the potential decay is more prominent in $H_2$ (Figure 10 (a)), to which the contribution of CDM is small and the baryon is the major contributor (see Eq. (17)). The increase of $H_3$ is partly accounted for by the modification of the $\omega_m$ term that enters into $H_3$ in Eq. (18). Dodelson et al. ^5J showed that the increase of the second and third peaks is understood by the potential decay. We do not pursue our analysis further, as it would not give us more insight than that given by Dodelson et al.'s analysis.

^11 A gentle increase for small $\omega_\nu$ in Figure 8 (b) is caused by the increase in the radiation component of the neutrino energy density $\rho_{\nu,r}$ relative to the matter counterpart $\rho_{\nu,m}$ for small $\omega_\nu$. Note that $\rho_{\nu,r}$, defined by Eq. (34), is not necessarily monotonically decreasing as a function of neutrino mass.

FIG. 8: Effective parameters of the massless neutrino theory that are required to mock the massive neutrino world.



V. THE NEUTRINO MASS CONSTRAINT IN NON-FLAT UNIVERSES 

We remove the assumption of $\Omega_{\rm tot} = 1$ and study the constraint on the neutrino mass in positive and negative curvature universes. We made a $\chi^2$ minimum search only for a few values of $\omega_\nu$ close to the upper limit obtained in the flat universe, since the search is time-consuming but an upper limit comparable to the one for the flat universe is anticipated from an analytic argument. We only consider the universe with $\Omega_{\rm tot} = 1.02$, 1.04 (positive curvature) and $\Omega_{\rm tot} = 0.98$ (negative curvature), which are still allowed by WMAP. The solutions that give a $\chi^2$ minimum are given for $\omega_\nu = 0$, 0.02, 0.025 and 0.03 in Table IV for the positive curvature case ($\Omega_{\rm tot} = 1.02$) and in Table V for the negative curvature case. The minimum $\chi^2$ is plotted in a figure presented earlier. The figure shows that $\chi^2_{\min}$ is slightly smaller for the positive curvature and larger for the negative curvature for a given $\omega_\nu$ ($\neq 0$). We find, however, that this does not change the limit on the neutrino mass. For $\omega_\nu = 0$, the universe of a slightly positive curvature is somewhat more favoured, viz. $\chi^2({\rm flat}, \omega_\nu = 0) - \chi^2(\Omega_{\rm tot} = 1.02, \omega_\nu = 0) = 1$, as already known in earlier analyses 0,0]. This decrease of $\chi^2$ at the global minimum compensates the decrease of $\chi^2$ seen at $\omega_\nu \sim 0.02$. So, when the likelihood is computed relative to the global minimum in parameter space allowing $\Omega_{\rm tot}$ to vary, the limit on the neutrino mass remains unchanged. We also find that the introduction of massive neutrinos always increases $\chi^2$ relative to the case of massless neutrinos; the presence of massive neutrinos does not modify the limit on the curvature. We finally note that the limit on massive neutrinos becomes tightened when $\Omega_{\rm tot} \gtrsim 1.03$.

FIG. 9: Dependence of $H_1$ on $\omega_\nu$ predicted in the mock massless neutrino theory (dashed line), as compared with the true theory with massive neutrinos (solid line). For illustration, the results with the theories where only the early integrated Sachs-Wolfe effect is mocked by changing $\omega_m$ and $N_\nu$ (dotted line) and where only the late integrated Sachs-Wolfe effect is mocked by changing $h$ (dot-dashed line) are also shown.

FIG. 10: Dependence of (a) $H_2$ and (b) $H_3$ on $\omega_\nu$ predicted in the mock massless neutrino theory (dashed line), as compared with the true theory with massive neutrinos (solid line).

TABLE V: Solutions for $\chi^2_{\min}(\omega_\nu)$ in the negative curvature universe with $\Omega_{\rm tot} = 0.98$.

omega_nu  omega_b  omega_m  h      tau     n_s    A       chi^2   l_1  H_1   H_2    H_3
0.02      0.0220   0.139    0.640  0.0878  0.923  1127.1  1431.5  219  6.23  0.442  0.435
0.025     0.0220   0.134    0.627  0.0871  0.912  1146.9  1433.5  219  6.15  0.440  0.428
0.03      0.0220   0.129    0.624  0.0790  0.900  1171.4  1435.7  219  6.06  0.439  0.420

It is easy to see how the effect of massive neutrinos is modified from the case of the flat universe. We first note that the partial derivatives with respect to $\Omega_{\rm tot}$ are

$$\Delta\ell_1 = -360\,\frac{\Delta\Omega_{\rm tot}}{\Omega_{\rm tot}}, \tag{41}$$
$$\Delta H_1 = +4.5\,\frac{\Delta\Omega_{\rm tot}}{\Omega_{\rm tot}}. \tag{42}$$

The first relation shows the well-known dependence on $\Omega_{\rm tot}$ that the last scattering surface is magnified in the positive curvature universe. The second relation arises from the late integrated Sachs-Wolfe effect. For $\Omega_{\rm tot} > 1$ the reduction of the late integrated Sachs-Wolfe effect decreases $C_{10}$, and hence increases $H_1$. $H_2$ and $H_3$ do not depend on $\Omega_{\rm tot}$. At first glance one might suspect that a large response of $\ell_1$ to $\Omega_{\rm tot}$ for $\Omega_{\rm tot} < 1$ would cancel the negative change of $\ell_1$ induced by a finite neutrino mass and relax the limit for the negative curvature universe. This, however, is not the case.
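For orientation, the linear responses in Eqs. (41) and (42) can be evaluated for the curvature values considered here. The Python sketch below assumes the flat universe as the fiducial point; the function names are illustrative.

```python
def delta_ell1(omega_tot, omega_fid=1.0):
    """Linear shift of the first peak position from Eq. (41)."""
    return -360.0 * (omega_tot - omega_fid) / omega_fid


def delta_h1(omega_tot, omega_fid=1.0):
    """Linear shift of the first peak height H_1 from Eq. (42)."""
    return 4.5 * (omega_tot - omega_fid) / omega_fid


for omega_tot in (0.98, 1.02, 1.04):
    print(omega_tot, delta_ell1(omega_tot), delta_h1(omega_tot))
```

A 2% departure from flatness moves $\ell_1$ by about 7, which is large compared with how tightly $\ell_1$ is measured, so the shift has to be absorbed elsewhere, as discussed next.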

The position of $\ell_1$ is tightly constrained by the data. So the change in $\ell_1$ from either the massive neutrino or the departure from flat spatial curvature is compensated by a change in $h$, which is unconstrained. The negative curvature makes this shift smaller, and the positive curvature makes it larger, as seen in Figure 2 (c). Note that among the 6 cosmological parameters, only $h$ receives a significant change when a small curvature is introduced. All other parameters change by no more than a few percent from the values for the flat universe. The positive curvature increases $H_1$ via Eq. (42), and an extra decrease of $h$ also increases $H_1$. The increase of $H_1$ gives some more allowance with respect to the observational lower limit of $H_1$, which lowers $\chi^2$ and would in principle weaken the constraint. However, when we remove the spatial flatness assumption, the global minimum, realised at $\omega_\nu = 0$, occurs at a $\chi^2$ smaller than that for the flat universe. This offsets the decrease of $\chi^2(\omega_\nu \neq 0)$, and we obtain a limit on the neutrino mass virtually unchanged from the case of the flat universe.

Although the limit on the neutrino mass is formally unchanged in a positive curvature universe, the cost is a significant decrease of $h$, as seen in Figure 2 (c). To realise the 2$\sigma$ limit, $\omega_\nu \sim 0.021$, we are led to $H_0 \approx 50$ km s$^{-1}$ Mpc$^{-1}$, an unacceptably small value.

The argument may go in the opposite way for the negative curvature, but the limit on the neutrino mass becomes substantially stronger. We calculate $\Delta\chi^2$ in the full non-zero curvature parameter space: for negative curvatures $\Delta\chi^2(\omega_\nu = 0)$ is already significant relative to the global minimum, which is realised in a positive curvature universe.

Note that our discussion does not deal with $H_2$ and $H_3$, because these quantities depend on neither $\Omega_{\rm tot}$ nor $h$ directly. The change of these quantities takes place only through the adjustment of other parameters, and is small.

In conclusion we find that the constraint on the neutrino mass we obtained for the flat universe, $\omega_\nu < 0.021$, is unchanged even when a non-zero spatial curvature is allowed.

VI. CONCLUSION 

We showed that a subelectronvolt upper limit can be derived on the neutrino mass from the CMB data alone within the $\Lambda$CDM model with adiabatic perturbations. This is contrary to the statements made in Elgarøy and Lahav [9] and Tegmark et al. [8], who stressed that the large-scale galaxy clustering information is essential to derive the limit on the neutrino mass. Assuming the flatness of the universe, the constraint we obtained from the one-year data of the WMAP observation alone by maximising the likelihood is $\omega_\nu < 0.021$, or $\sum m_\nu < 2.0$ eV, at the 95% confidence level (for degenerate neutrinos, which is close to reality if the neutrino mass is close to the limit, $m_\nu < 0.66$ eV). This is slightly weaker than the limit $\sum m_\nu < 1.7$ eV derived by the combined use of WMAP and SDSS data, or similar limits that are obtained by combining more different types of data. Our limit is a robust result in the sense that it is free of systematics from biasing, non-linear effects and others, and is solely determined by the CMB data, for which systematics are controlled very well. Our constraint is unchanged even if we relax the flatness assumption. The inclusion of the tensor perturbation only tightens the limit. The assumption we still need is the power-law primordial fluctuation spectrum.

We argued that it would not be easy to improve the limit beyond $\sum m_\nu < 1.5$ eV using the CMB data alone, even if the CMB multipole data are substantially improved. This "critical limit" corresponds to the situation where neutrinos become nonrelativistic at the recombination epoch. That is, we can derive the constraint when neutrinos become nonrelativistic before the recombination epoch. The improvement of the limit on the neutrino mass requires some external inputs, most characteristically a lower limit on the Hubble constant, or those that effectively lead to a constraint on the Hubble constant, such as the Type Ia supernova Hubble diagram or the large-scale clustering of galaxies. If $H_0$ were to receive a firm lower limit, say $H_0 > 65$ km s$^{-1}$ Mpc$^{-1}$, the upper limit on the neutrino mass would be tightened to $\sum m_\nu < 0.8$ eV.

We demonstrated the mechanism by which these constraints are derived, using the reduced CMB observables $\ell_1$, $H_1$, $H_2$ and $H_3$ introduced by [3], and studying their responses to the neutrino mass density. The key point is that $\ell_1$ and $H_2$ are constrained to narrow ranges by observation, and the variation of the cosmological parameters induced by the finite neutrino density cannot be accommodated in the error budget of $H_1$ when the neutrino mass increases beyond $\sum m_\nu \sim 2$ eV.

We also showed that the response of the reduced CMB observables, in particular $\ell_1$ and $H_1$, to the neutrino mass density is understood by the modification of the integrated Sachs-Wolfe effect in the presence of massive neutrinos. In addition, free streaming of massive neutrinos promotes the decay of the gravitational potential that enhances $H_2$ and $H_3$, whose scales are within free streaming p. This leads to the negative correlation between $n_s$ and $m_\nu$, in contrast to the positive correlation expected from the suppression of the small-scale power due to massive neutrinos.

The most important message from our analysis is that (i) one can derive an upper limit on the neutrino mass, only slightly weaker than that quoted in the modern literature, using the CMB (WMAP) data alone: one can thereby avoid using mixed data of different quality or with possible systematic effects such as biasing and nonlinear effects for galaxies, and (ii) one may improve the limit only by a modest amount even when the quality of the CMB data is improved. For a substantial improvement of the limit one needs a constraint on the Hubble constant from below.



APPENDIX A: MULTIDIMENSIONAL $\chi^2$-MINIMIZATION

Our problem is to minimise $f = \chi^2(n_s, \omega_m, \omega_b, \tau, h, A)$ in the 6-dimensional parameter space. Since we want to avoid calculating derivatives, we adopt the Brent method [2^ and generalise it to a multidimensional problem. For a one-dimensional problem the Brent method samples 3 points, $f(x_a)$, $f(x_c)$, $f(x_b)$, and draws a parabola that connects the three $f$'s to find the value $x_1$ that gives the valley of $f$. Then $f(x_1)$ and the two neighbouring $f$'s are used to find the next parabola and its valley at $x_2$. This process is applied successively until the desired convergence is reached.
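The parabolic step described above can be sketched in a few lines of Python. This is a bare successive parabolic interpolation, not a full Brent implementation: the golden-section safeguards of Brent's method are omitted, and the function name, tolerance and iteration cap are choices made for this illustration.

```python
def parabolic_minimise(f, xa, xb, xc, tol=1e-10, max_iter=50):
    """Successive parabolic interpolation through three points.

    Returns (x_min, f_min). No bracketing safeguard is applied, so this is
    only reliable when the start points are close to a smooth minimum.
    """
    pts = sorted((x, f(x)) for x in (xa, xb, xc))
    for _ in range(max_iter):
        (x0, f0), (x1, f1), (x2, f2) = pts
        den = (x1 - x0) * (f1 - f2) - (x1 - x2) * (f1 - f0)
        if den == 0.0:
            break  # the three points are collinear: no parabola vertex
        num = (x1 - x0) ** 2 * (f1 - f2) - (x1 - x2) ** 2 * (f1 - f0)
        x_new = x1 - 0.5 * num / den  # vertex of the parabola through the points
        if min(abs(x_new - x) for x, _ in pts) < tol:
            break  # the valley no longer moves
        # keep the new point and its two nearest neighbours
        pts.append((x_new, f(x_new)))
        pts.sort(key=lambda p: abs(p[0] - x_new))
        pts = sorted(pts[:3])
    return min(pts, key=lambda p: p[1])


x_min, f_min = parabolic_minimise(lambda x: (x - 2.0) ** 2 + 1.0, 0.0, 1.0, 5.0)
print(x_min, f_min)
```

For an exactly quadratic function the first parabola already lands on the minimum, which makes this a convenient sanity check.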

For a multidimensional problem, say $f(x, y, z)$, we first minimise $f$ with respect to $z$ by applying the Brent method in this direction, with $x$ and $y$ fixed to arbitrary grid values $x_a$ and $y_a$. We find successively new $z$ grids $z_1(x_a, y_a)$, $z_2(x_a, y_a)$, ..., and eventually reach $f(x_a, y_a, z_{\min}(x_a, y_a))$. We next minimise it with respect to $y$ using $(y_b, y_c)$. We carry out the $z$ minimisation for $y_b$ and $y_c$, i.e., $f(x_a, y_b, z_{\min}(x_a, y_b))$ and $f(x_a, y_c, z_{\min}(x_a, y_c))$, and successively add a new $y$ grid, $y_1, y_2, \ldots$, while repeating the $z$ minimisation procedure at each step; we eventually arrive at $f(x_a, y_{\min}(x_a), z_{\min}(x_a, y_{\min}(x_a)))$. We repeat the same procedure with respect to $x$. Starting from

$$f(x_a, y_{\min}(x_a), z_{\min}(x_a, y_{\min}(x_a))),$$
$$f(x_b, y_{\min}(x_b), z_{\min}(x_b, y_{\min}(x_b))),$$
$$f(x_c, y_{\min}(x_c), z_{\min}(x_c, y_{\min}(x_c))),$$

we finally find

$$f(x_{\min}, y_{\min}(x_{\min}), z_{\min}(x_{\min}, y_{\min}(x_{\min}))),$$

which is the desired result.
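The nesting described above can be written recursively. The Python sketch below is illustrative only: it swaps in a golden-section search on a bounded interval as the derivative-free one-dimensional minimiser (a simpler stand-in for the Brent method), and the function names, bounds and test function are hypothetical.

```python
import math


def golden_min(f, lo, hi, tol=1e-6):
    """Golden-section search on [lo, hi]; a derivative-free stand-in for Brent."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc          # minimum lies in [a, d]
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd          # minimum lies in [c, b]
            d = a + inv_phi * (b - a)
            fd = f(d)
    x = 0.5 * (a + b)
    return x, f(x)


def nested_min(f, bounds, fixed=()):
    """Minimise f over the coordinates in bounds; the last coordinate is innermost."""
    lo, hi = bounds[0]
    if len(bounds) == 1:
        x, fx = golden_min(lambda t: f(*fixed, t), lo, hi)
        return (x,), fx

    def profile(t):
        # value of f at t after all inner coordinates have been minimised
        return nested_min(f, bounds[1:], fixed + (t,))[1]

    x, _ = golden_min(profile, lo, hi)
    inner, fx = nested_min(f, bounds[1:], fixed + (x,))
    return (x,) + inner, fx


def coupled(x, y, z):
    return (x - 1.0) ** 2 + (y - x) ** 2 + (z - y) ** 2


best_point, best_value = nested_min(coupled, [(-3.0, 3.0)] * 3)
print(best_point, best_value)
```

The cost grows as the product of the one-dimensional evaluation counts, which is why such a search is time-consuming in six dimensions and why each outer step reuses only the minimised inner value.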

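For our problem of $f = \chi^2(n_s, \omega_m, \omega_b, \tau, h, A)$, applying the minimisation in the order of $A$, $h$, $\tau$, $\omega_b$, $\omega_m$ and $n_s$, the final value would be $\chi^2(n_{s,\min}, \omega_{m,\min}, \omega_{b,\min}, \tau_{\min}, h_{\min}, A_{\min})$, where the omitted arguments are

$$\omega_{m,\min} = \omega_{m,\min}(n_{s,\min}), \tag{A1}$$
$$\omega_{b,\min} = \omega_{b,\min}(n_{s,\min}, \omega_{m,\min}), \tag{A2}$$
$$\tau_{\min} = \tau_{\min}(n_{s,\min}, \omega_{m,\min}, \omega_{b,\min}), \tag{A3}$$
$$h_{\min} = h_{\min}(n_{s,\min}, \omega_{m,\min}, \omega_{b,\min}, \tau_{\min}), \tag{A4}$$
$$A_{\min} = A_{\min}(n_{s,\min}, \omega_{m,\min}, \omega_{b,\min}, \tau_{\min}, h_{\min}). \tag{A5}$$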



We find that this nested one-dimensional minimisation works well for the WMAP $\chi^2$ function, and the minimum obtained gives a lower $\chi^2$ than those found by the Markov chain Monte Carlo methods given in the literature. Caution is needed for the outermost nest, the minimisation with respect to $n_s$. We find two minima for small $\omega_\nu$, so we apply the minimisation procedure to each case separately. If more than one minimum is found in the course of an intermediate minimisation, we must divide the parameter space and apply the minimisation procedure separately. We do not find, however, such cases other than that quoted above.

APPENDIX B: COMPARISON OF THE GRID SEARCH AND MCMC 

We compare the likelihood function for $\omega_\nu = 0$ inferred from the $\chi^2$ function with those obtained by the MCMC given in the literature. In Figure 11 we present $\mathcal{L} = \exp(-\Delta\chi^2/2)$ and the $\mathcal{L}$ given by Tegmark et al. and Spergel et al. [71] for the variable $n_s$. We see that our likelihood function agrees very well with Tegmark et al.'s for $n_s < 1.05$, but it starts deviating for $n_s > 1.05$, where our likelihood function is much larger, meaning that Tegmark et al.'s chain does not find a true local minimum near the second peak. We emphasise that the relative heights of the two peaks of our $\mathcal{L}$ are verified to be close to the 'true' likelihoods by marginalising the parameters using the multidimensional integral, as mentioned in the text. The likelihood function of Spergel et al. also agrees with the two curves. The difference is that they do not get the second peak due to the prior of $\tau < 0.3$.

FIG. 11: Likelihood functions ($\omega_\nu = 0$) for $n_s$ estimated from our $\chi^2$ statistics (solid line), as compared with those from MCMC given by the WMAP group (data points with errors) and Tegmark et al. (dashed line). The maximum is normalised to unity.
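The conversion from a set of minimised $\chi^2$ values to a likelihood curve normalised to unity at its maximum is a one-liner. The Python sketch below uses hypothetical function names; the three input values are the $\chi^2$ entries of Table V and serve purely as sample numbers, since the actual likelihood is computed relative to the global minimum over all parameters.

```python
import math


def relative_likelihood(chi2_values):
    """exp(-delta chi^2 / 2), normalised so that the best point has likelihood 1."""
    chi2_min = min(chi2_values)
    return [math.exp(-0.5 * (c - chi2_min)) for c in chi2_values]


sample_chi2 = [1431.5, 1433.5, 1435.7]   # sample inputs only (Table V entries)
likelihoods = relative_likelihood(sample_chi2)
print(likelihoods)
```

With this normalisation, an increase of $\chi^2$ by 4 above the minimum corresponds to a relative likelihood of $e^{-2} \approx 0.135$.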

Figure 12 demonstrates an example of the distribution of $\omega_m$ when $n_s$ is fixed at 0.98. The figure shows a distribution well fitted with a Gaussian function ($\exp\{-(\omega_m - a)^2/2b^2\}$ with $a = 0.146$ and $b = 0.0162$). Once one requires the parameters to stay close to one of the local minima, the distribution is consistent with a Gaussian. This is also true for the other 4 parameters.